The limited and dynamic resources on edge devices motivate us to deploy an optimized deep neural network that can adapt its sub-networks to fit different resource constraints. However, existing works typically build sub-networks by searching over different network architectures in a hand-crafted sampling space, which not only can lead to subpar performance but also may incur on-device re-configuration overhead. In this paper, we propose a novel training algorithm, Dynamic Real-time Sparse Subnets (DRESS). DRESS samples multiple sub-networks from the same backbone network through row-based unstructured sparsity and trains these sub-networks in parallel with a weighted loss. DRESS also exploits strategies including parameter reuse and row-based fine-grained sampling for efficient storage consumption and efficient on-device adaptation. Extensive experiments on public vision datasets show that DRESS achieves significantly higher accuracy than state-of-the-art sub-networks.
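To make the row-based unstructured sparsity concrete, the following PyTorch sketch keeps the largest-magnitude weights in each row of a shared backbone layer and trains several keep ratios in parallel with a weighted loss. Class names, keep ratios, and loss weights are illustrative assumptions, not the authors' implementation; ranking by magnitude makes the sparser masks subsets of the denser ones, loosely mirroring the parameter-reuse idea.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def row_sparse_mask(weight: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Row-based unstructured sparsity: keep the largest-magnitude entries in every row."""
    k = max(1, int(weight.size(1) * keep_ratio))
    idx = weight.abs().topk(k, dim=1).indices
    mask = torch.zeros_like(weight)
    mask.scatter_(1, idx, 1.0)
    return mask

class SharedBackboneLinear(nn.Module):
    """A linear layer whose sparse subnets all reuse the same backbone parameters."""
    def __init__(self, in_features, out_features, keep_ratios=(1.0, 0.5, 0.25)):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.keep_ratios = keep_ratios

    def forward(self, x, subnet: int):
        # Magnitude ranking makes the 25% mask a subset of the 50% mask (parameter reuse).
        mask = row_sparse_mask(self.weight.detach(), self.keep_ratios[subnet])
        return F.linear(x, self.weight * mask, self.bias)

def weighted_parallel_loss(layer, x, y, loss_weights=(1.0, 0.7, 0.5)):
    """Train all subnets in parallel by summing their weighted task losses."""
    losses = [F.cross_entropy(layer(x, s), y) for s in range(len(loss_weights))]
    return sum(w * l for w, l in zip(loss_weights, losses))
```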
Magnetic Resonance Fingerprinting (MRF) is an efficient quantitative MRI technique that can extract important tissue and system parameters such as T1, T2, B0, and B1 from a single scan. This property also makes it attractive for retrospectively synthesizing contrast-weighted images. In general, contrast-weighted images like T1-weighted, T2-weighted, etc., can be synthesized directly from parameter maps through spin-dynamics simulation (i.e., Bloch or Extended Phase Graph models). However, these approaches often exhibit artifacts due to imperfections in the mapping, the sequence modeling, and the data acquisition. Here we propose a supervised learning-based method that directly synthesizes contrast-weighted images from the MRF data without going through the quantitative mapping and spin-dynamics simulation. To implement our direct contrast synthesis (DCS) method, we deploy a conditional Generative Adversarial Network (GAN) framework and propose a multi-branch U-Net as the generator. The input MRF data are used to directly synthesize T1-weighted, T2-weighted, and fluid-attenuated inversion recovery (FLAIR) images through supervised training on paired MRF and target spin echo-based contrast-weighted scans. In-vivo experiments demonstrate excellent image quality compared to simulation-based contrast synthesis and previous DCS methods, both visually and in terms of quantitative metrics. We also demonstrate cases where our trained model is able to mitigate in-flow and spiral off-resonance artifacts that are typically seen in MRF reconstructions and thus more faithfully represent conventional spin echo-based contrast-weighted images.
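For intuition, here is a toy PyTorch sketch of the multi-branch generator idea: a shared encoder over the MRF time-series channels with one lightweight decoder branch per target contrast. This is a minimal sketch under that assumption, not the paper's actual generator, and the conditional-GAN adversarial training loop is omitted.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True),
    )

class MultiBranchUNet(nn.Module):
    """Shared encoder over the MRF time series, one decoder branch per target contrast."""
    def __init__(self, mrf_channels: int, contrasts=("T1w", "T2w", "FLAIR")):
        super().__init__()
        self.enc1 = conv_block(mrf_channels, 64)
        self.enc2 = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.decoders = nn.ModuleDict({
            c: nn.Sequential(conv_block(128 + 64, 64), nn.Conv2d(64, 1, 1)) for c in contrasts
        })

    def forward(self, mrf):
        s1 = self.enc1(mrf)                     # full-resolution skip features
        bottleneck = self.enc2(self.pool(s1))   # downsampled bottleneck
        up = self.up(bottleneck)
        # Each contrast-specific branch sees the same shared features.
        return {c: dec(torch.cat([up, s1], dim=1)) for c, dec in self.decoders.items()}
```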
The development of deep learning based image representation learning (IRL) methods has attracted great attention in the context of remote sensing (RS) image understanding. Most of these methods require the availability of a high quantity and quality of annotated training images, which can be time-consuming and costly to gather. To reduce labeling costs, publicly available thematic maps, automatic labeling procedures or crowdsourced data can be used. However, such approaches increase the risk of including label noise in training data. This may result in overfitting on noisy labels when discriminative reasoning is employed, as in most of the existing methods, leading to sub-optimal learning procedures and thus inaccurate characterization of RS images. In this paper, for the first time in RS, we introduce a generative reasoning integrated label noise robust representation learning (GRID) approach. GRID aims to model the complementary characteristics of discriminative and generative reasoning for IRL under noisy labels. To this end, we first integrate generative reasoning into discriminative reasoning through a variational autoencoder. This allows our approach to automatically detect training samples with noisy labels. Then, through our label noise robust hybrid representation learning strategy, GRID adjusts the whole learning procedure for IRL of these samples through generative reasoning and that of the other samples through discriminative reasoning. Our approach learns discriminative image representations while preventing interference of noisy labels during training, independently of the IRL method. Thus, unlike the existing methods, GRID does not depend on the type of annotation, label noise, neural network, loss or learning task, and can therefore be utilized for various RS image understanding problems. Experimental results show the effectiveness of GRID compared to state-of-the-art methods.
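To make the hybrid discriminative/generative idea more tangible, the sketch below couples a classification head with a VAE head on a shared encoder and routes each sample's loss with a simple noisy-label heuristic (unusually high per-sample cross-entropy). The routing rule, threshold, and loss weights are assumptions for illustration only and do not reproduce the GRID objective; the input is assumed to be a flattened feature vector.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridVAEClassifier(nn.Module):
    """Shared encoder with a classification head and a VAE decoder (generative head)."""
    def __init__(self, in_dim, num_classes, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu, self.logvar = nn.Linear(256, latent_dim), nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))
        self.classifier = nn.Linear(256, num_classes)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        return self.classifier(h), self.decoder(z), mu, logvar

def hybrid_loss(model, x, y, noise_quantile=0.8):
    logits, recon, mu, logvar = model(x)
    ce = F.cross_entropy(logits, y, reduction="none")                 # discriminative, per sample
    rec = F.mse_loss(recon, x, reduction="none").mean(dim=1)          # generative reconstruction
    kld = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)   # VAE regularizer
    elbo_loss = rec + kld
    suspect = ce > ce.detach().quantile(noise_quantile)   # heuristic flag for likely noisy labels
    # Generative reasoning for suspected noisy samples, discriminative for the rest.
    return torch.where(suspect, elbo_loss, ce + 0.1 * elbo_loss).mean()
```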
Neural sequence models, especially transformers, exhibit a remarkable capacity for in-context learning. They can construct new predictors from sequences of labeled examples $(x, f(x))$ presented in the input without further parameter updates. We investigate the hypothesis that transformer-based in-context learners implement standard learning algorithms implicitly, by encoding smaller models in their activations, and updating these implicit models as new examples appear in the context. Using linear regression as a prototypical problem, we offer three sources of evidence for this hypothesis. First, we prove by construction that transformers can implement learning algorithms for linear models based on gradient descent and closed-form ridge regression. Second, we show that trained in-context learners closely match the predictors computed by gradient descent, ridge regression, and exact least-squares regression, transitioning between different predictors as transformer depth and dataset noise vary, and converging to Bayesian estimators for large widths and depths. Third, we present preliminary evidence that in-context learners share algorithmic features with these predictors: learners' late layers non-linearly encode weight vectors and moment matrices. These results suggest that in-context learning is understandable in algorithmic terms, and that (at least in the linear case) learners may rediscover standard estimation algorithms. Code and reference implementations are released at https://github.com/ekinakyurek/google-research/blob/master/incontext.
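The paper's own reference implementations are at the linked repository; purely as a stand-alone illustration of the reference predictors that an in-context learner is compared against, the NumPy sketch below computes the closed-form ridge solution and a gradient-descent solution from a set of in-context examples. The transformer call itself is hypothetical and not shown.

```python
import numpy as np

def ridge_predictor(X, y, x_query, lam=0.1):
    """Closed-form ridge regression fit on the in-context examples (x_i, f(x_i))."""
    d = X.shape[1]
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    return x_query @ w

def gradient_descent_predictor(X, y, x_query, lr=0.01, steps=500):
    """Reference algorithm: plain gradient descent on the squared loss of a linear model."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * (X.T @ (X @ w - y)) / len(y)
    return x_query @ w

# A trained in-context learner's prediction on the same prompt (hypothetical model call)
# would be compared against these reference predictors, e.g.:
#   icl_pred = transformer(prompt_with_examples)
#   print(abs(icl_pred - ridge_predictor(X, y, x_query)))
rng = np.random.default_rng(0)
X, w_true = rng.normal(size=(16, 8)), rng.normal(size=8)
y = X @ w_true + 0.1 * rng.normal(size=16)
x_q = rng.normal(size=8)
print(ridge_predictor(X, y, x_q), gradient_descent_predictor(X, y, x_q))
```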
Humans can reason compositionally when presented with new tasks. Previous research has shown that appropriate prompting techniques enable large language models (LLMs) to solve artificial compositional generalization tasks such as SCAN. In this work, we identify the challenges posed by more realistic semantic parsing tasks with larger vocabularies, and refine these prompting techniques to address them. Our best method is based on least-to-most prompting: it decomposes the problem using prompting-based syntactic parsing, then uses this decomposition to select appropriate exemplars and to sequentially generate the semantic parse. This method allows us to set a new state of the art for CFQ while requiring only 1% of the training data used by traditional approaches. Owing to the generality of our approach, we expect similar efforts to bring new results in other tasks and domains, especially for knowledge-intensive applications.
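The Python sketch below shows only the overall shape of such a least-to-most pipeline: decompose the question with a prompt, use the decomposition to pick exemplars, and generate the parse piece by piece. The `llm` callable, the prompt wording, the naive exemplar-selection heuristic, and the exemplar format are hypothetical placeholders, not the paper's prompts.

```python
from typing import Callable, Dict, List

def least_to_most_parse(question: str,
                        llm: Callable[[str], str],
                        exemplar_pool: List[Dict[str, str]]) -> str:
    """Illustrative two-stage least-to-most prompting pipeline for semantic parsing."""
    # Stage 1: prompt-based decomposition of the question into simpler sub-phrases.
    decomposition = llm(
        "Decompose the question into a sequence of simpler sub-phrases, separated by ';'.\n"
        f"Question: {question}\nDecomposition:"
    ).split(";")

    # Stage 2: naively select exemplars that cover the decomposed pieces ...
    selected = [ex for ex in exemplar_pool
                if any(p.strip() and p.strip() in ex["question"] for p in decomposition)]
    context = "\n".join(f"Q: {ex['question']}\nParse: {ex['parse']}" for ex in selected[:5])

    # ... and generate the semantic parse sequentially, one sub-phrase at a time.
    parse = ""
    for piece in decomposition:
        parse = llm(f"{context}\nQ: {question}\nPartial parse: {parse}\n"
                    f"Extend the parse to cover: {piece.strip()}\nParse:")
    return parse
```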
Language models demonstrate both quantitative improvements and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities remain poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Recording the dynamics of unscripted human interactions in the wild is challenging due to the delicate trade-offs between several factors: participant privacy, ecological validity, data fidelity, and logistical overheads. To address these, following a "datasets for the community, by the community" ethos, we propose the Conference Living Lab (ConfLab): a new concept for multimodal multisensor data collection of in-the-wild free-standing social conversations. For the first instantiation of ConfLab described here, we organized a real-life professional networking event at a major international conference. Involving 48 conference attendees, the dataset captures a diverse mix of status, acquaintanceship, and networking motivations. Our capture setup improves upon the data fidelity of prior in-the-wild datasets while retaining privacy sensitivity: 8 videos (1920x1080, 60 fps) from a non-invasive overhead view, and custom wearable sensors with onboard recording of body motion (full 9-axis IMU), privacy-preserving low-frequency audio (1250 Hz), and Bluetooth-based proximity. In addition, we developed custom solutions for distributed hardware synchronization at acquisition time, and for time-efficient continuous annotation of body keypoints and actions at high sampling rates. Our benchmarks showcase some of the open research tasks related to in-the-wild privacy-preserving social data analysis: keypoint detection from overhead camera views, skeleton-based no-audio speaker detection, and F-formation detection.
Perceptual distances between images, measured in the space of pre-trained deep features, have outperformed prior low-level, pixel-based metrics at assessing image similarity. While it is well known that older models such as AlexNet and VGG capture perceptual similarity, modern and more accurate models have been studied far less. In this paper, we present a large-scale empirical study to assess how well ImageNet classifiers perform on perceptual similarity. First, we observe an inverse correlation between ImageNet accuracy and the perceptual scores of modern networks such as ResNets, EfficientNets, and Vision Transformers: better classifiers achieve worse perceptual scores. Then, we examine the ImageNet accuracy/perceptual score relationship across varying depths, widths, numbers of training steps, weight decay, label smoothing, and dropout. Higher accuracy improves the perceptual score up to a certain point, but we uncover a Pareto frontier between accuracy and perceptual score in the mid-to-high accuracy regime. We explore this relationship further using a number of plausible hypotheses such as distortion invariance, spatial frequency sensitivity, and alternative perceptual functions. Interestingly, we find shallow ResNets, trained for fewer than 5 epochs only on ImageNet, whose emergent perceptual scores match those of prior best networks trained directly on supervised human perceptual judgments.
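As background on how a classifier's "perceptual score" can be measured, the sketch below scores two-alternative forced-choice (2AFC) agreement between human judgments and distances computed in the classifier's feature space, in the style of LPIPS-type evaluations. The function names, the distance formula, and the triplet format are assumptions; the paper's exact evaluation protocol may differ.

```python
import torch
import torch.nn.functional as F

def perceptual_distance(feat_extractor, img_a, img_b):
    """Distance between two images in a classifier's feature space (an LPIPS-style proxy)."""
    fa, fb = feat_extractor(img_a), feat_extractor(img_b)
    fa, fb = F.normalize(fa.flatten(1), dim=1), F.normalize(fb.flatten(1), dim=1)
    return ((fa - fb) ** 2).sum(dim=1)

def two_afc_perceptual_score(feat_extractor, triplets):
    """Fraction of 2AFC triplets (ref, img0, img1, human_choice) on which the feature
    distance agrees with the human judgment of which image is closer to the reference."""
    correct = 0
    for ref, img0, img1, human_choice in triplets:
        d0 = perceptual_distance(feat_extractor, ref, img0)
        d1 = perceptual_distance(feat_extractor, ref, img1)
        model_choice = int((d1 < d0).item())   # 1 if img1 looks closer to the reference
        correct += int(model_choice == human_choice)
    return correct / len(triplets)
```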
Due to the availability of multi-modal remote sensing (RS) image archives, one of the most important research topics is the development of cross-modal RS image retrieval (CM-RSIR) methods that can search for semantically similar images across different modalities. Existing CM-RSIR methods require the availability of a high quality and quantity of annotated training images. In operational scenarios, collecting a sufficient number of reliably labeled images is time-consuming, complex, and costly, and can significantly affect the final accuracy of CM-RSIR. In this paper, we introduce a novel self-supervised CM-RSIR method that aims to: i) model the mutual information between different modalities in a self-supervised manner; ii) keep the distributions of the modality-specific feature spaces similar to each other; and iii) define the most similar images within each modality without requiring any annotated training images. To this end, we propose a novel objective that includes three loss functions used simultaneously: i) maximizing the mutual information of different modalities to preserve inter-modal similarity; ii) minimizing the angular distance of multi-modal image tuples to eliminate inter-modal discrepancies; and iii) increasing the cosine similarity of the most similar images within each modality to characterize intra-modal similarity. Experimental results show the effectiveness of the proposed method compared to state-of-the-art methods. The code of the proposed method is publicly available at https://git.tu-berlin.de/rsim/ss-cm-rsir.
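A rough sketch of how three such loss terms could be written for a batch of paired embeddings is given below. The specific formulations (an InfoNCE-style mutual-information bound, an arccos angular term, and a nearest-neighbour cosine term) are an illustrative reading of the abstract rather than the paper's exact objective; the equal weighting and the temperature are assumptions.

```python
import torch
import torch.nn.functional as F

def cm_rsir_losses(z_a, z_b, temperature=0.1):
    """z_a, z_b: embeddings of the SAME scenes in two modalities, shape (N, D)."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)

    # (i) Inter-modal similarity: InfoNCE-style lower bound on cross-modal mutual information.
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0), device=z_a.device)
    l_mi = 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

    # (ii) Inter-modal discrepancy: shrink the angular distance between paired embeddings.
    l_ang = torch.arccos((z_a * z_b).sum(dim=1).clamp(-1 + 1e-6, 1 - 1e-6)).mean()

    # (iii) Intra-modal similarity: pull each image toward its most similar neighbour
    # within its own modality (self-similarity masked out).
    def intra(z):
        sim = z @ z.t() - 2.0 * torch.eye(z.size(0), device=z.device)
        return (1.0 - sim.max(dim=1).values).mean()
    l_intra = intra(z_a) + intra(z_b)

    return l_mi + l_ang + l_intra
```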
This paper presents a novel deep metric learning based semi-supervised regression (DML-S2R) method to address parameter estimation problems. The proposed DML-S2R method aims to mitigate the problem of insufficient labeled samples without collecting any additional samples with target values. To this end, it consists of two main steps: i) pairwise similarity modeling with scarce labeled data; and ii) triplet-based metric learning with abundant unlabeled data. The first step aims to model pairwise sample similarities by using a small number of labeled samples. This is achieved by estimating the differences between the target values of labeled sample pairs with a Siamese neural network (SNN). The second step aims to learn a triplet-based metric space (in which similar samples are close to each other and dissimilar samples are far apart) when the number of labeled samples is insufficient. This is achieved by employing the SNN of the first step for triplet-based deep metric learning, which exploits not only the labeled samples but also the unlabeled ones. For the end-to-end training of DML-S2R, we investigate an alternating learning strategy for the two steps. Thanks to this strategy, the information encoded in each step guides the learning phase of the other step. Experimental results confirm the success of DML-S2R compared to state-of-the-art semi-supervised regression methods. The code of the proposed method is publicly available at https://git.tu-berlin.de/rsim/dml-s2r.
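Below is a minimal PyTorch sketch of the two steps, assuming vector inputs: a Siamese encoder whose difference head regresses the target-value difference of labeled pairs (step 1), and a triplet loss on the same encoder for unlabeled triplets (step 2), which can be alternated during training. Layer sizes and the triplet-mining strategy are assumptions not specified here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseRegressor(nn.Module):
    """Siamese encoder with a head that predicts the target-value difference of a pair."""
    def __init__(self, in_dim, emb_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim))
        self.diff_head = nn.Linear(emb_dim, 1)

    def embed(self, x):
        return self.encoder(x)

    def forward(self, x1, x2):
        return self.diff_head(self.embed(x1) - self.embed(x2)).squeeze(-1)

def pairwise_step(model, x1, x2, y1, y2):
    """Step 1: supervised pairwise-similarity modelling on the scarce labeled data."""
    return F.mse_loss(model(x1, x2), y1 - y2)

def triplet_step(model, anchor, positive, negative, margin=1.0):
    """Step 2: triplet metric learning on abundant unlabeled data, reusing the same encoder."""
    za, zp, zn = model.embed(anchor), model.embed(positive), model.embed(negative)
    return F.triplet_margin_loss(za, zp, zn, margin=margin)

# The two steps can be alternated (e.g., epoch by epoch) so that each guides the other.
```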